Comments-Oriented Document Summarization Based on Multi-aspect Co-feedback Ranking

Authors

  • Lifu Huang
  • Hongjie Li
  • Lian'en Huang
Abstract

With the popularity of Web 2.0, comments left by readers on web documents have drawn much attention. In this paper, we study the problem of comments-oriented document summarization, which aims to summarize a web document by considering not only its content but also its comments. Most comments convey one or a few aspects of the document. Given a set of sentences from both the web document and its corresponding comments, we can divide the sentences into clusters (called “aspects”) according to their content. It is challenging and interesting to summarize the web document based on these clusters. Motivated by this, we propose a novel model, MultiAspectCoRank, for comments-oriented document summarization. First, we rank all the sentences based on the multiple aspects obtained from the whole document, and then provide each ranking list as feedback to the others until the top-N results of each ranking list no longer change. We obtain the final result by integrating these ranking lists. Experimental results on a set of real-world blog data with manually labeled sentences show the promising performance of our
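The ranking loop described in the abstract can be illustrated with a small sketch. This is not the authors' MultiAspectCoRank formulation, which the abstract does not specify. The function name `co_feedback_rank`, the additive `boost` rule, and the summed final integration are all assumptions made for illustration. The sketch only shows the general shape: one ranking list per aspect, each list fed back to the others, iteration until every top-N list stops changing, then integration.

```python
from typing import Dict, List


def co_feedback_rank(
    aspect_scores: Dict[str, List[float]],
    top_n: int = 2,
    boost: float = 0.3,      # hypothetical feedback strength, not from the paper
    max_rounds: int = 50,    # safety cap in case the lists never stabilize
) -> List[int]:
    """Return sentence indices, best first, after a toy co-feedback loop."""
    names = list(aspect_scores)
    n = len(aspect_scores[names[0]])
    scores = {a: list(aspect_scores[a]) for a in names}

    def top(values: List[float]) -> List[int]:
        # Indices of the top_n highest scores; ties broken by lower index.
        return sorted(range(n), key=lambda i: (-values[i], i))[:top_n]

    for _ in range(max_rounds):
        before = {a: top(scores[a]) for a in names}
        # Assumed feedback rule: a sentence's base score under one aspect is
        # raised once for every other aspect whose top-N currently contains it.
        scores = {
            a: [
                aspect_scores[a][i]
                + boost * sum(i in before[b] for b in names if b != a)
                for i in range(n)
            ]
            for a in names
        }
        # Stop once no ranking list's top-N has changed.
        if all(top(scores[a]) == before[a] for a in names):
            break

    # Assumed integration step: sum the per-aspect scores.
    combined = [sum(scores[a][i] for a in names) for i in range(n)]
    return sorted(range(n), key=lambda i: (-combined[i], i))


ranking = co_feedback_rank({"a": [0.9, 0.1, 0.5], "b": [0.3, 0.8, 0.6]})
print(ranking)  # [2, 0, 1]
```

In this toy input, sentence 2 is never the best under either aspect, but it sits in the top-N of both lists, so the mutual feedback lifts it to first place in the integrated ranking.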


Similar resources

Generating Aspect-oriented Multi-Document Summarization with Event-aspect model

In this paper, we propose a novel approach to the automatic generation of aspect-oriented summaries from multiple documents. We first develop an event-aspect LDA model to cluster sentences into aspects. We then use an extended LexRank algorithm to rank the sentences in each cluster, and Integer Linear Programming for sentence selection. Key features of our method include automatic grouping of seman...

Generic Multi-Document Summarization Using Topic-Oriented Information

Graph-based ranking models have recently been widely used for multi-document summarization. By exploiting the correlations between sentences, salient sentences can be extracted according to their ranking scores. However, traditional methods treat all sentences uniformly, without considering topic-level information. This paper proposes the topic-oriented PageRank (ToPageRank)...

Decayed DivRank for Guided Summarization

Guided summarization is essentially aspect-based multi-document summarization, where aspects can be taken as specified queries. We proposed a novel ranking algorithm, Decayed DivRank (DDRank), for the guided summarization tasks of TAC2011. DDRank can address relevance, importance, diversity, and novelty simultaneously through a decayed vertex-reinforced random walk process in sen...

iDVS: An Interactive Multi-document Visual Summarization System

Multi-document summarization is a fundamental tool for understanding documents. Given a collection of documents, most existing multi-document summarization methods automatically generate a single static summary for all users, using unsupervised learning techniques such as sentence ranking and clustering. However, these methods almost entirely exclude humans from the summarization process. They do not allow...

Reader-Aware Multi-Document Summarization: An Enhanced Model and The First Dataset

We investigate the problem of reader-aware multi-document summarization (RA-MDS) and introduce a new dataset for this problem. To tackle RA-MDS, we extend a variational auto-encoder (VAE) based MDS framework by jointly considering news documents and reader comments. To evaluate summarization performance, we prepare a new dataset. We describe the methods for data collection, aspect...

Publication date: 2013